# Chinese Multimodal Understanding

Chinese Clip Vit Large Patch14

A Chinese CLIP model based on the ViT (Vision Transformer) architecture that supports Chinese vision-language tasks.

Image Classification
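
As a rough illustration of how a CLIP-style model can be used for image classification, the sketch below scores one image embedding against several label-text embeddings with cosine similarity and a softmax. It is not the model's actual code: the embedding vectors are made-up toy values, `zero_shot_classify` is a hypothetical helper name, and the `logit_scale` of 100.0 is an assumed temperature. In practice the embeddings would come from the model's image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_emb, label_embs, logit_scale=100.0):
    """Return one probability per label for a single image embedding.

    logit_scale is an assumed temperature, not a value taken from the model.
    """
    image_emb = l2_normalize(image_emb)
    logits = []
    for emb in label_embs:
        emb = l2_normalize(emb)
        cosine = sum(a * b for a, b in zip(image_emb, emb))
        logits.append(logit_scale * cosine)

    # Numerically stable softmax over the scaled similarities.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data (hypothetical): Chinese label prompts and hand-picked 3-d embeddings.
labels = ["猫", "狗", "汽车"]  # cat, dog, car
toy_image_emb = [0.9, 0.1, 0.2]
toy_label_embs = [
    [1.0, 0.0, 0.1],
    [0.1, 1.0, 0.0],
    [0.0, 0.2, 1.0],
]

probs = zero_shot_classify(toy_image_emb, toy_label_embs)
best = labels[max(range(len(labels)), key=probs.__getitem__)]
print(best, [round(p, 3) for p in probs])
```

Because the labels are plain text prompts, the label set can be changed without retraining, which is why this kind of model is listed under image classification.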

Mengzi Oscar Base

A Chinese multimodal pretrained model built on the Oscar framework, initialized from Mengzi-BERT base and trained on 3.7 million image-text pairs.

Transformers Chinese

© 2025 AIbase